Text File | 1987-09-25 | 27KB | 683 lines

CHAPTER 11   MACROS AND CONDITIONAL ASSEMBLY


Macro Facility
----- --------

A86 contains an easy-to-use, but very powerful macro facility.  The
facility subsumes the capabilities of most assemblers, including
operand concatenation, indefinite repeat (often called IRP), and
indefinite-repeat character (IRPC).  Unlike other assemblers, A86
integrates these functions into the main macro facility; so they
can be invoked without clumsy syntax, or strange characters in the
macro-call operands.


Simple Macro Syntax

All macros must be defined before they are used.  A macro
definition consists of the name of the macro, followed by the word
MACRO, followed by the text of the macro, followed by #EM, which
marks the end of the macro.

Many assembly languages require a list of dummy operand-names to
follow the word MACRO.  A86 does not: the operands are denoted in
the text with the fixed names #1, #2, #3, ... up to a limit of #9,
for each operand in order.  If there is anything following the word
MACRO, it is considered part of the macro text.

Examples:

  ; CLEAR sets the register-operand to zero.

  CLEAR MACRO SUB #1,#1 #EM

  CLEAR AX    ; generates a SUB AX,AX instruction
  CLEAR BX    ; generates a SUB BX,BX instruction

  ; MOVM moves the second operand to the first operand.  Both
  ;   operands can be memory-variables.

  MOVM MACRO
    MOV AL,#2
    MOV #1,AL
  #EM

  VAR1 DB ?
  VAR2 DB ?

  MOVM VAR1,VAR2   ; generates MOV AL,VAR2 followed by MOV VAR1,AL


Formatting in macro definitions and calls

The format of a macro definition is flexible.  If the macro text
consists of a single instruction, the definition can be given in a
single line, as in the CLEAR macro given above.  There is no
particular advantage to doing this, however: the assembler prunes
all unnecessary spaces, blank lines, and comments from the macro
text before entering the text into the symbol table.  I recommend
the more spread-out format of the MOVM macro, for program
readability.

All special macro-operators within a macro definition begin with a
hash-sign # (a hex 23 byte).  The letters following the hash-sign
can be given in either upper-case or lower-case.  Hash-sign
operators are recognized even within quoted strings.  If you wish
the hash-sign to be treated literally, and not as the start of a
special macro-operator, you must give 2 consecutive hash signs: ##.
For example:

  FOO MACRO
    DB '##1'
    DB '#1'
  #em

  FOO abc     ; produces DB '#1' followed by DB 'abc'

The format of the macro call line is also flexible.  A macro call
consists of the name of the macro, followed by the operands to be
plugged into the macro.  The assembler prunes leading and trailing
blanks from the operands of a macro call.  The operands to a macro
call are always separated by commas.  Also, as in all assembler
source lines, a semi-colon occurring outside of a quoted-string is
the start of a comment, ignored by the assembler.  If you want to
include commas, blanks, or semi-colons in your operands, you must
enclose your operand in single-quotes.


Macro operand substitution

Some macro assemblers expect the operands to macro calls to follow
the same syntax as the operands to instructions.  In those
assemblers, the operands are parsed, and reduced to numeric values
before being plugged into the macro definition text.  This is
called "passing by value".  A86 does not pass by value, it passes
by text.

The only parsing of operands done by the macro processor is to
determine the start and the finish of the operand text.  That text
is substituted, without regard for its contents, for the "#n" that
appears in the macro definition.  The text is interpreted by the
assembler only after a complete line is expanded and as it is
assembled.

If the first non-blank character after the macro name is a comma,
then the first operand is null: any occurrences of #1 in the macro
text will be deleted, and replaced with nothing.
Likewise, any two consecutive commas with no non-blanks between
them will result in the corresponding null operand.  Also,
out-of-range operands are null; for example, #3 is a null operand
if only two operands are provided in the call.

Null operands to macros are not in themselves illegal.  They will
produce errors only if the resulting macro expansion is illegal.

The method of passing by text allows operand-text to be plugged
anywhere into a macro, even within symbol names.  For example:

  ; KF_ENTRY creates an entry in the KFUNCS table, consisting of a
  ;   pointer to a KF_-action-routine.  It also declares the
  ;   corresponding CF_-symbol, which is the index within the table
  ;   for that entry.

  KF_ENTRY MACRO
    CF_#1 EQU ($-KFUNCS)/2+080
    DW KF_#1
  #EM

  KFUNCS:
    KF_ENTRY UP
    KF_ENTRY DOWN

  ; The above code is equivalent to:
  ;
  ; KFUNCS:
  ;   DW KF_UP
  ;   DW KF_DOWN
  ;
  ; CF_UP EQU 080
  ; CF_DOWN EQU 081


Quoted-string operands

As mentioned before, if you want to include blanks, commas, or
semicolons in your operands, you enclose the operand in
single-quotes.  In the vast majority of cases in which these
special characters need to be part of operands, the user wants them
to be quoted in the final, assembled line also.  Therefore, the
quotes are passed in the operand.  To override this, and strip the
quotes from the string, you precede the quoted string with a
hash-sign.  Examples:

  DBW MACRO
    DB #1
    DW #2
  #EM

  DBW 'E', E_POINTER
  DBW 'W', W_POINTER

  ; note that if quotes were not passed, the above lines would have
  ;   to be DBW '''E''', E_POINTER; DBW '''W''', W_POINTER

  GENERAL_PUSH MACRO
    PUSH#1
  #EM

  GENERAL_PUSH F        ; generates a PUSHF instruction
  GENERAL_PUSH #' AX'   ; generates a PUSH AX instruction

The fact that I could not come up with a more useful example than
GENERAL_PUSH is strong evidence that it is much better to pass the
quotes as the default action.
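
As a further illustration of the default rule, here is a sketch
that is not taken from the A86 distribution; the macro name MSG is
made up for this example.  Because the quotes are passed along
with the operand, a string containing a comma and a blank can be
handed to DB as a single operand:

  MSG MACRO
    DB #1
    DB 0
  #EM

  MSG 'Hello, world'   ; by the rules above, this should expand to
                       ;   DB 'Hello, world' followed by DB 0

Without the quotes, the comma would separate the text into two
operands, and only the first of them would be substituted for #1.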


Looping by operands in macros

This macro facility contains two kinds of loops: you can loop once
for each operand in a range of operands; or you can loop once for
each character within an operand.  The first kind of loop, the
R-loop, is discussed in this section; the second kind, the C-loop,
is discussed later.

An R-loop is a stretch of macro-definition code that is repeated
when the macro is expanded.  In addition to the fixed operands #1
through #9, you can specify a variable operand, whose number
changes each time through the loop.  You give the variable operand
one of the 4 names #W, #X, #Y, or #Z.

An R-loop begins with #R, followed immediately by the letter W,X,Y,
or Z naming the variable, followed by the number of the first
operand to be used, followed by the number of the last operand to
be used.  After the #Rxnn is the text to be repeated.  The R-loop
ends with #ER.  For example:

  STORE3 MACRO
    MOV AX,#1
  #RY24       ; "repeat for Y running from 2 through 4"
    MOV #Y,AX
  #ER
  #EM

  STORE3 VAR1,VAR2,VAR3,VAR4

  ; the above call produces the 4 instructions MOV AX,VAR1;
  ;   MOV VAR2,AX; MOV VAR3,AX; MOV VAR4,AX.


The #L last operator and indefinite repeats

The macro facility recognizes the special operator #L, which is the
last operand in a macro call.  #L can appear anywhere in macro
text; but its big power occurs in conjunction with R-loops, to
yield an indefinite-repeat facility.

A common example is as follows: you can take any macro that is
designed for one operand, and easily convert it into a macro that
accepts any number of operands.  You do this by placing the command
#RX1L, "repeat for X running from 1 through L", at the start of the
macro, and the command #ER at the end just before the #EM.
Finally, you replace all instances of #1 in the macro with #X.  We
see how this works with the CLEAR macro:

  CLEAR MACRO #RX1L
    SUB #X,#X
  #ER #EM

  CLEAR AX,BX   ; generates both SUB AX,AX and SUB BX,BX in one macro!

It is possible for R-loops to iterate zero times.
In this case, the loop-text is skipped completely.  For example,
CLEAR without any operands would produce no expanded text.


Character-loops

We have seen the R-loop; now we discuss the other kind of loop in
macros, the character-loop, or C-loop.  In the C-loop, the variable
W,X,Y, or Z does not represent an entire operand; it represents a
character within an operand.

You start a C-loop with #C, followed by one of the 4 letters W,X,Y,
or Z, followed by a single operand-specifier.  Following the #Cxn
is the text of the C-loop.  The C-loop ends with #EC.  The macro
will loop once for every character in the operand.  That single
character will be substituted for each instance of the indicated
variable-operand.  For example:

  PUSHC MACRO #CW1
    PUSH #WX
  #EC#EM

  PUSHC ABC    ; generates the 3 instructions PUSH AX; PUSH BX; PUSH CX

If the C-operand is quoted in the macro call, the quotes ARE
removed from the operand before passing characters to the loop.  It
is not necessary to precede the quoted string with a hash-sign in
this case.  If you do, the hash-sign will be passed as the first
character.

If the C-operand is a null operand (no characters in it), the
loop-text is skipped completely.


The "B"-before and "A"-after operators

So far, we have seen that you can specify operands in your macro in
fourteen different ways: 1,2,3,4,5,6,7,8,9,W,X,Y,Z,L.  We now
multiply these 14 possibilities, by introducing the "A" and "B"
operators.  You can precede any of the 14 specifiers with "A" or
"B", to get the adjacent operand after or before the specified
operand.  For example, BL means the operand just before the last
operand; in other words, the second-to-the-last operand.  AZ means
the operand just after the Z operand.  You can even repeat, up to a
limit of 4 "B"s or 3 "A"s: BBL is the third-to-last operand; #AAA9
can be used where you would want to (but cannot) use #12.
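
A small sketch of the "B" operator applied to a fixed specifier,
following the rules just given.  The macro name COPYLAST is
hypothetical and is not part of A86:

  COPYLAST MACRO
    MOV #BL,#L
  #EM

  COPYLAST AX,BX,CX   ; #BL is BX and #L is CX, so this should
                      ;   generate MOV BX,CX

Because #BL and #L are counted from the end of the call, the same
macro works unchanged for any call with two or more operands.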

In the case of the variable operand to a C-loop, the "A" and "B"
specifiers denote the characters before or after the current
looping-character.  An example of this is given in the next
section.


Multiple-increments within loops

We have seen that you end an R-loop with a #ER, and you end a
C-loop with a #EC.  We now present another way to end these loops;
a way that lets you specify a larger increment to the macro's
loop-counter.  You can end your loops with one of the 4 additional
commands #E1, #E2, #E3, or #E4.

For R-loops terminated by #ER, the variable-operand advances to the
next operand when the loop is made.  If you end your R-loop with
#E2, the variable-operand advances 2 operands, not just one.  For
#E3, it advances 3 operands; for #E4, 4 operands.  The #E1 command
is the same as #ER.

The most common usage of this feature is as follows: You will
recall that we generalized the CLEAR macro with an R-loop, so that
it would take an indefinite number of operands.  Suppose we want to
do the same thing with the DBW macro.  We would like DBW to take
any number of operands, and alternate DBs and DWs indefinitely on
the operands.  This is made possible by creating an R-loop
terminated by #E2:

  DBW MACRO #RX1L
    DB #X
    DW #AX
  #E2 #EM

  DBW 'E',E_POINTER, 'W',W_POINTER    ; two pairs on same line!

The #E2 terminator means that we are looping on a pair of operands.
Note the crucial usage of the "A"-after operator to specify the
second operand of the operand-pair.

A special note applies to the DBW macro above: the assembler just
happens to accept a DW directive with no operands (it generates no
object code, and issues no error).  This means that DBW will accept
an odd number of operands with no error, and do the expected thing
(it alternates bytes and words, ending with a byte).

You could likewise generalize a macro with 3 or 4 operands, to an
indefinite number of triples or quadruples; by ending the R-loop
with #E3 or #E4.
The operands in each group would be specified by #X, #AX, #AAX,
and, for #E4, #AAAX.

For C-loops terminated by #E1 through #E4, the character-pointer is
advanced the specified number of characters.  You use this in much
the same way as for R-loops, to create loops on pairs, triplets,
and quadruplets of characters.  For example:

  PUSHC2 MACRO #CZ1
    PUSH #Z#AZ
  #E2 #EM

  PUSHC2 AXBXSIDI   ; generates PUSH AX; PUSH BX; PUSH SI; PUSH DI


Negative R-loops

We now introduce another form of R-loop, called the Q-loop-- the
negative repeat-loop.  This loop is the same as the R-loop, except
that the operand number decrements instead of increments; and the
loop exits when the number falls below the finish-number, not above
it.  The Q-loop is specified by #Qxnn instead of #Rxnn, and #EQ
instead of #ER.  You can also use the multiple-decrement forms #E1
#E2 #E3 or #E4 to terminate a Q-loop.  Example:

  MOVN MACRO #QXL2   ; "negative-repeat X from L down to 2"
    MOV #BX,#X
  #EQ#EM

  MOVN AX,BX,CX,DX   ; generates the three instructions:
                     ;   MOV CX,DX
                     ;   MOV BX,CX
                     ;   MOV AX,BX

Note: the above functionality is already built into the MOV
instruction of the assembler.  The macro shows how you would
implement it if you did not already have this facility.


Nesting of loops in macros

This macro facility allows nesting of loops within each other.
Since we provide the 4 identifiers W,X,Y,Z for the loop-operands,
you can nest to a level of 4 without restriction-- just use a
different letter for each nesting level.  You can nest even deeper,
subject to the restriction that a letter W,X,Y,Z refers to the
innermost containing loop that defines it.


Implied closing of loops

If you have a loop or loops ending when the macro ends, and if the
increment for those loops is 1 (that is, the loops would end with
#ER, #EC, or #EQ rather than #E2 through #E4), you may omit the
#ER, #EC, or #EQ.  The assembler closes all open loops when it
sees #EM, with no error.
For example, if you omit the #ER for the loop-version of the CLEAR
macro, it would make no difference-- the assembler automatically
places an #ER code into the macro definition for you.


Local labels in macros

Some assemblers have a LOCAL pseudo-op that is used in conjunction
with macros.  Symbols declared LOCAL to a macro have unique (and
bizarre) symbol-names substituted for them each time the macro is
called.  This solves the problem of duplicate label definitions
when a macro is called more than once.

In A86, the problem is solved more elegantly, by having a class of
generic local labels throughout assembly, not just in macros.
Recall that symbols consisting of a single letter, followed by one
or more decimal digits, can be redefined.  You can use such labels
in your macro definitions.

I have recommended that local labels outside of macros be
designated L1 through L9.  Within macro definitions, I suggest that
you use labels M1 through M9.  If you used an Ln-label within a
macro, you would have to make sure that you never call the macro
within the range of definition of another Ln-label with the same
name.  By using Mn-labels, you avoid such potential conflicts.

The following example of a local label within a macro is taken from
the source of the macro-processor itself:

  ; "JHASH label" checks to see if AL is a hash sign.  If it is,
  ;   it processes the hash-sign term, and jumps to label.
  ;   Otherwise, it drops through to the following code.

  JHASH MACRO
    CMP AL,'##'        ; is the scanned character a hash-sign?
    JNE >M1            ; skip if not
    CALL MDEF_HASH     ; process the hash sign
    JMP #1             ; jump to the label provided
  M1:
  #EM
  ...
  L3:                  ; loop here to eat empty lines, leading blanks
    CALL SKIP_BLANKS   ; skip over the leading blanks of a line
    INC SI             ; advance source ptr beyond the next non-blank
    JHASH L3           ; if hash-sign then process, and eat more blanks
    CMP AL,0A          ; were the blanks terminated by a linefeed?
    JE L3              ; loop if yes, nothing on this line
  L5:                  ; loop here after a line is seen to have contents
    CMP AL,';'         ; have we reached the start of a comment?
    JE L1              ; jump if yes, to consume the comment
    JHASH >L6          ; if hash-sign then process it; get next char
    ...
  L6:
    LODSB              ; fetch the next definition-char from the source
    CMP AL,' '         ; is it blank?
    JA L5              ; loop if not, to process it
    ...


Debugging macro expansions

There is a tool called EXMAC which will help you troubleshoot
program lines that call macros.  If you are not sure about what
code is being generated by your macro calls, EXMAC will tell you.
See Chapter 13 for details.


Conditional Assembly
----------- --------

A86 has a conditional assembly feature, that allows you to specify
that blocks of source code will or will not be assembled, according
to the values of equated user symbols.  The controlling symbols can
be declared in the program (and can thus be the result of
assembly-time expressions), or they can be declared in the
assembler invocation.

You should keep in mind the difference between conditional
assembly, invoked by #IF, and the structured-programming feature,
invoked by IF without the hash-sign.  #IF tests a condition at
assembly-time, and can cause code to not be assembled and thus not
appear in the program.  IF causes code to be assembled that tests a
condition at run-time, possibly jumping over code.  The skipped
code will always appear in the program.

All conditional assembly lines are identified by a hash-sign # as
the first non-blank character of a line.  The hash-sign is followed
by one of the four keywords IF, ELSEIF, ELSE or ENDIF.

#IF starts a conditional-assembly block.  On the same line,
following the #IF, you provide a name.  If the name is undefined,
or if it has been equated to zero, then the following lines of code
are skipped, up to the next matching #ELSEIF, #ELSE, or #ENDIF.  If
the name is non-zero, then the following lines of code are
assembled normally.
If a subsequent matching #ELSEIF or #ELSE is encountered, then code
is skipped up to the matching #ENDIF.

#ELSEIF provides a multiple-choice facility for #IF-blocks.  You
can give any number of #ELSEIFs between an #IF and its matching
#ENDIF.  Each #ELSEIF has a name following it on the same line.  If
the name following the #IF has zero value, then the assembler looks
for the first non-zero name following an #ELSEIF, and assembles
that block of code.  If there are no non-zero #ELSEIFs, then the
#ELSE-block (if there is one) is assembled.

It is legal to provide an undefined name after #IF or #ELSEIF.  The
name is interpreted as being false (zero), with no error.

You may precede the name in an #IF or #ELSEIF line with an
exclamation point "!", which acts as a NOT-operator: code will be
skipped if the name is non-zero instead of zero.

#ELSE marks the beginning of code to be assembled if all the
previous blocks of an #IF have been skipped over.  There is no
operand after the #ELSE.  There can be at most one #ELSE in an
#IF-block, and it must appear after any #ELSEIFs.

#ENDIF marks the end of an #IF-block.  There is no operand after
#ENDIF.

It is legal to have nested #IF-blocks; that is, #IF-blocks that are
contained within other #IF-blocks.  #ELSEIF, #ELSE, and #ENDIF
always refer to the innermost nested #IF-block.

As an example of conditional assembly, suppose that you have a
program that comes in three versions: one for Texas, one for
Oklahoma, and one for the rest of the nation.  The three programs
differ in a limited number of places.  Instead of keeping three
different versions of the source code, you can keep one version,
and use conditional assembly on the boolean variables TEXAS and
OKLAHOMA to control the assembler output.
A sample block would be:

  #if TEXAS
    DB 0,1,2,3
  #elseif OKLAHOMA
    DB 4,5,6,7
  #else
    DB 8,9,10,11
  #endif

If a block of code is to be assembled only if TEXAS is false, then
you would use the exclamation-point operator:

  #if !TEXAS
    DB 0FF
  #endif


Conditional Assembly and Macros

You may have conditional-assembly blocks either in
macro-definitions or in macro expansions.  The only limitation is
that if you have an #IF-block in a macro expansion, the entire
block (i.e., the matching #ENDIF) must appear in the same macro
expansion.  You cannot, for example, define a macro that is a
synonym for #IF.

To have your conditional-assembly block apply to the macro
definition, you provide the block normally within the definition.
For example:

  X1 EQU 0

  BAZ MACRO
  #if X1
    DB 010
  #else
    DB 011
  #endif
  #EM

  BAZ
  X1 EQU 1
  BAZ

In the above sequence of code, the conditional-assembly block is
acted upon when the macro BAZ is defined.  The macro therefore
consists of the single line DB 011, with all the
conditional-assembly lines removed from the definition.  Thus, both
expansions of BAZ produce the object-code byte of 011, even though
the local label X1 has turned non-zero for the second invocation.

To have your conditional-assembly block appear in the macro
expansion, you must literalize the hash-sign on each
conditional-assembly line by giving two hash-signs:

  X1 EQU 0

  BAZ MACRO
  ##if X1
    DB 010
  ##else
    DB 011
  ##endif
  #EM

  BAZ
  X1 EQU 1
  BAZ

Now the entire conditional-assembly block is stored in the macro
definition, and acted upon each time the macro is expanded.  Thus,
the two invocations of BAZ will produce the different object bytes
011 and 010, since X1 has become non-zero for the second expansion.

You will usually want your conditional-assembly blocks to be acted
upon at macro-definition time, to save symbol-table space.  You
will thus use the first form, with the single hash-signs.
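
The definition-time form combines naturally with macro operands.
The following is only a sketch under the rules described above;
the names USE_SHIFT and TWICE are invented for illustration and do
not come from A86 itself:

  USE_SHIFT EQU 1

  TWICE MACRO
  #if USE_SHIFT
    SHL #1,1
  #else
    ADD #1,#1
  #endif
  #EM

  TWICE AX   ; USE_SHIFT is non-zero when TWICE is defined, so the
             ;   stored macro text should be just SHL #1,1, and
             ;   this call should generate SHL AX,1

Since the choice is made once, when the macro is defined, only the
selected instruction occupies space in the symbol table.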


Conditional Assembly and the XREF Program

In most cases, the XREF program will recognize conditional-assembly
blocks, and ignore skipped-code in its XREF compilation.  The last
macro example above, however, is an example in which XREF will not
skip the same blocks that the assembler will; because it falls
under the following

WARNING: The XREF program will use the value of all symbols as it
existed at the end of assembly.  XREF does not parse statements
that change the value of local variables!  Thus, if you have
conditional assembly based on a variable whose value changes during
assembly, XREF will compile different source than the assembler
assembled.

The above warning does not apply to invocation-variables, described
below.  If you wish to change the value of a conditional-control
variable during assembly, and if you wish XREF to give accurate
results, you should change the variable between file-names in the
invocation, as described below.


Declaring Variables in the Assembler Invocation

To facilitate the effective use of conditional assembly, this
assembler allows you to declare boolean (true-false) symbols in the
command-line that invokes the assembler.  The declarations can
appear anywhere in the list of source file names.  They are
distinguished from the file names by a leading equals-sign =.

To declare a symbol TRUE (value = 1), give the name after the
equals-sign.  DO NOT put any spaces between the equals-sign and the
name!

To declare a symbol FALSE (value = 0), you can give an equals-sign,
an exclamation-point, then the name.  Again, DO NOT embed any
blanks!

Example: if your source files are src1.8, src2.8, and src3.8, then
you can assemble with TEXAS true by invoking the assembler as
follows:

  a86 =TEXAS src1.8 src2.8 src3.8

You can assemble with TEXAS explicitly set to FALSE as follows:

  a86 =!TEXAS src1.8 src2.8 src3.8

Note that if TEXAS is used only as a conditional-assembly control,
then you do not need to include the =!TEXAS in the invocation,
because an undefined TEXAS will automatically be interpreted as
false.


Null Invocation Variable Names

The assembler will ignore an equals-sign by itself in the
invocation line, without error.  This allows you to generate
assembler-invocation lines using parameters that could be either
boolean-variable-names, or null strings.  For example, in the
previously-mentioned TEXAS-OKLAHOMA-nation example, the program
could be invoked via a .BAT file called "AMAKE.BAT", coded as
follows:

  A86 =%1 *.8

You invoke the assembler by typing one of the following:

  amake texas
  amake oklahoma
  amake

The third line will produce the assembler-invocation A86 = *.8;
causing no invocation-variables to be declared.  Thus both TEXAS
and OKLAHOMA will be false, which is exactly what you want for the
rest-of-the-nation version of the program.


Changing Values of Invocation Variables

The usual prohibition against changing the value of a symbol that
is not a local-label does not apply to invocation-variables.  For
example, suppose you have a conditional-control variable DEBUG,
which will generate diagnostic code for debugging when it is true.
Suppose further that you have already debugged source files src1.8
and src3.8; but you are still working on src2.8.  You may invoke
the assembler as follows:

  A86 src1.8 =DEBUG src2.8 =!DEBUG src3.8

The variable DEBUG will be TRUE only during assembly of src2.8,
just as you want.
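
Within the source files, the diagnostic code would be wrapped in an
#IF-block on DEBUG.  A minimal sketch follows; DUMP_REGS is a
hypothetical diagnostic routine, assumed to be defined elsewhere in
the program, and is not something supplied by A86:

  #if DEBUG
    CALL DUMP_REGS   ; assembled only while DEBUG is true
  #endif

With the invocation shown above, these lines would be assembled
where they appear in src2.8, and skipped where they appear in
src1.8 or src3.8, since DEBUG is not true while those files are
assembled.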